Comparison of Batch Normalization and Weight Normalization Algorithms for the Large-scale Image Classification
Authors
Abstract
Batch normalization (BN) has become a de facto standard for training deep convolutional networks. However, BN accounts for a significant fraction of training run-time and is difficult to accelerate, since it is memory-bandwidth bound. This drawback motivates us to explore recently proposed weight normalization algorithms (WN algorithms), i.e. weight normalization, normalization propagation, and weight normalization with translated ReLU. These algorithms do not slow down training iterations and were experimentally shown to outperform BN on relatively small networks and datasets. However, it is not clear whether these algorithms could replace BN in large-scale applications. We answer this question by providing a detailed comparison of BN and WN algorithms using a ResNet-50 network trained on ImageNet. We found that although WN achieves better training accuracy, the final test accuracy is significantly lower (≈ 6%) than that of BN. This result demonstrates the surprising strength of the BN regularization effect, which we were unable to compensate for using standard regularization techniques like dropout and weight decay. We also found that training with WN algorithms is significantly less stable than training with BN, limiting their practical applications.
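As a rough illustration of the cost difference the abstract describes, the sketch below contrasts the two ideas in plain Python. It is not the paper's code: the function names `weight_norm` and `batch_norm` and the toy inputs are assumptions made for this example. It uses the standard formulations, w = g · v / ||v|| for weight normalization and per-feature mini-batch statistics for batch normalization.

```python
import math

def weight_norm(v, g):
    """Reparameterize a weight vector as w = g * v / ||v||.

    Only the parameters are touched; no activations are read.
    """
    norm = math.sqrt(sum(x * x for x in v))
    return [g * x / norm for x in v]

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch.

    Every activation in the batch is read to compute the mean and
    variance, then read again to apply the normalization.
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical toy inputs, chosen only for illustration.
w = weight_norm([3.0, 4.0], g=2.0)      # same direction as v, length g
y = batch_norm([1.0, 2.0, 3.0, 4.0])    # zero mean, roughly unit variance
```

The weight-normalized vector depends only on the parameters, so its cost does not grow with batch size, whereas the batch-normalized output requires extra passes over the activations, which is consistent with the memory-bandwidth cost noted above.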
Similar resources
Comparison of Count Normalization Methods for Statistical Parametric Mapping Analysis Using a Digital Brain Phantom Obtained from Fluorodeoxyglucose-positron Emission Tomography
Objective(s): Alternative normalization methods were proposed to solve the biased information of SPM in the study of neurodegenerative disease. The objective of this study was to determine the most suitable count normalization method for SPM analysis of a neurodegenerative disease based on the results of different count normalization methods applied on a prepared digital phantom similar to one ...
Computerize classification of Benign and malignant thyroid nodules by ultrasound imaging
Introduction: Early detection and treatment of thyroid nodules increase the cure rate and provide optimal treatment. Ultrasound is the chosen imaging technique for assessment of thyroid nodules. Confirmation of the diagnosis usually demands repeated fine needle aspiration biopsy (FNAB). So, current management, has morbidity and non zero mortality. The goal of the present study ...
Extending Network Normalization Schemes
Normalization techniques have only recently begun to be exploited in supervised learning tasks. Batch normalization exploits mini-batch statistics to normalize the activations. This was shown to speed up training and result in better models. However its success has been very limited when dealing with recurrent neural networks. On the other hand, layer normalization normalizes the activations ac...
Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes
Normalization techniques have only recently begun to be exploited in supervised learning tasks. Batch normalization exploits mini-batch statistics to normalize the activations. This was shown to speed up training and result in better models. However its success has been very limited when dealing with recurrent neural networks. On the other hand, layer normalization normalizes the activations ac...
Normalization of Parents’ Response to Children’s Positive Emotions Scale
Abstract This study evaluated the normalization of the Persian version of the Parents’ Response to Children’s Positive Emotions Scale (PRCPS). For evaluating reliability and validity of this scale through random sampling, 400 mothers of 4-7-year-old children completed the PRCPS and Cognitive Emotion Regulation Questionnaire (CERQ). Evaluating internal reliability of PRCPS subscales by Cronba...
Journal title:
- CoRR
Volume abs/1709.08145 Issue
Pages -
Publication date 2017